Segmentation of text/image documents using texture approaches
Author
Abstract
Digital computers and computer networks have made it possible to search for and retrieve electronically stored documents in seconds, no matter where in the world they are stored. This is far from the reality for documents stored as paper copies, so there is considerable interest in digitizing paper documents. To digitize existing paper documents, it is important to be able to separate the text from the graphics, so that the text can be made searchable and stored more efficiently. In this paper we present an approach to segmentation of text and graphics in scanned documents, based on the assumption that the text in a document may be viewed as one texture, while the graphics are a different texture. Using this assumption, we segment the documents with a texture segmentation scheme using filter banks as the feature extractors. While most traditional text-graphics segmentation schemes assume some a priori knowledge of the input, our approach is independent of document layout, typeface, font size, scanning resolution, etc. Another texture-based approach to text-graphics segmentation of documents has been presented by Jain and Bhattacharjee, using the Gabor filter as the feature extractor. In this paper we show that equally good results may be obtained using much more computationally efficient critically sampled perfect reconstruction filter banks.
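The idea of treating text and graphics as two textures can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a specific filter bank, feature, or classifier, so the one-level 2-D Haar bank (the simplest critically sampled perfect reconstruction filter bank), the block-averaged log-energy features, and the two-cluster k-means step below are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def haar_analysis(img):
    """One level of a 2-D Haar filter bank: four subbands, each half the size."""
    x = np.asarray(img, dtype=float)
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]  # crop to even size
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # local average
    lh = (a - b + c - d) / 2.0  # differences along rows
    hl = (a + b - c - d) / 2.0  # differences along columns
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh

def haar_synthesis(ll, lh, hl, hh):
    """Inverse of haar_analysis; reconstructs the (cropped) input exactly."""
    x = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def block_energy(band, block):
    """Mean squared coefficient over non-overlapping block x block tiles."""
    h, w = band.shape[0] // block, band.shape[1] // block
    sq = band[: h * block, : w * block] ** 2
    return sq.reshape(h, block, w, block).mean(axis=(1, 3))

def two_means(feats, iters=20):
    """Two-cluster k-means, seeded with the lowest- and highest-energy tiles."""
    flat = feats.reshape(-1, feats.shape[-1])
    total = flat.sum(axis=1)
    centers = np.stack([flat[total.argmin()], flat[total.argmax()]])
    for _ in range(iters):
        dist = ((flat[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean(axis=0)
    return labels.reshape(feats.shape[:-1])

def segment(img, block=8):
    """Label each tile 0 or 1 from the log-energy of the detail subbands."""
    _, lh, hl, hh = haar_analysis(img)
    feats = np.stack(
        [np.log1p(block_energy(band, block)) for band in (lh, hl, hh)], axis=-1
    )
    return two_means(feats)
```

Calling `segment(page)` on a grayscale page array returns a coarse label map with one entry per tile of the subband image. Which of the two labels corresponds to text is not decided by this sketch and would have to be determined separately.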
Similar resources
Unsupervised Texture Image Segmentation Using MRFEM Framework
Texture image analysis is one of the most important areas of image processing in medical sciences and industry. To date, different approaches have been proposed for segmentation of texture images. In this paper, we propose unsupervised texture image segmentation based on the Markov Random Field (MRF) model. First, we used a Gabor filter with different parameters (frequency, orientatio...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing stage by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...
Segmentation of text/graphic from handwritten mathematical documents using Gabor filter
Most handwritten mathematical documents contain graphics in addition to mathematical text. Thus, these documents must be segmented into homogeneous areas to facilitate their digitization. Text and graphic segmentation of these documents aims at dividing the document into two blocks: the first contains the text and the second the graphical objects. In this paper, we focus our int...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...